AITopics | individual gradient

Collaborating Authors

individual gradient

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

A Practical Debugging Tool for the Training of Deep Neural Networks Supplementary Material Checklist

Neural Information Processing SystemsAug-16-2025, 19:20:51 GMT

Do the main claims made in the abstract and introduction accurately reflect the paper's Did you describe the limitations of your work? Did you discuss any potential negative societal impacts of your work? In general, we believe, this work will have an overall positive impact as it can help shed light into the black-box that is deep learning. If you ran experiments... (a) Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Y es] All experimental results, as well as the complete code base to reproduce them can be Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? Did you report error bars (e.g., with respect to the random seed after running experiments multiple times)?

artificial intelligence, machine learning, ockpit, (20 more...)

Neural Information Processing Systems

Industry: Transportation (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Breaking Secure Aggregation: Label Leakage from Aggregated Gradients in Federated Learning

Wang, Zhibo, Chang, Zhiwei, Hu, Jiahui, Pang, Xiaoyi, Du, Jiacheng, Chen, Yongle, Ren, Kui

arXiv.org Artificial IntelligenceJun-22-2024

Federated Learning (FL) exhibits privacy vulnerabilities under gradient inversion attacks (GIAs), which can extract private information from individual gradients. To enhance privacy, FL incorporates Secure Aggregation (SA) to prevent the server from obtaining individual gradients, thus effectively resisting GIAs. In this paper, we propose a stealthy label inference attack to bypass SA and recover individual clients' private labels. Specifically, we conduct a theoretical analysis of label inference from the aggregated gradients that are exclusively obtained after implementing SA. The analysis results reveal that the inputs (embeddings) and outputs (logits) of the final fully connected layer (FCL) contribute to gradient disaggregation and label restoration. To preset the embeddings and logits of FCL, we craft a fishing model by solely modifying the parameters of a single batch normalization (BN) layer in the original model. Distributing client-specific fishing models, the server can derive the individual gradients regarding the bias of FCL by resolving a linear system with expected embeddings and the aggregated gradients as coefficients. Then the labels of each client can be precisely computed based on preset logits and gradients of FCL's bias. Extensive experiments show that our attack achieves large-scale label recovery with 100\% accuracy on various datasets and model architectures.

aggregated gradient, gradient, server, (14 more...)

arXiv.org Artificial Intelligence

2406.15731

Country: Asia > China > Zhejiang Province > Hangzhou (0.04)

Genre: Research Report (0.84)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Add feedback

Cockpit: A Practical Debugging Tool for Training Deep Neural Networks

Schneider, Frank, Dangel, Felix, Hennig, Philipp

arXiv.org Machine LearningFeb-12-2021

When engineers train deep learning models, they are very much "flying blind". Commonly used approaches for real-time training diagnostics, such as monitoring the train/test loss, are limited. Assessing a network's training process solely through these performance indicators is akin to debugging software without access to internal states through a debugger. To address this, we present Cockpit, a collection of instruments that enable a closer look into the inner workings of a learning machine, and a more informative and meaningful status report for practitioners. It facilitates the identification of learning phases and failure modes, like ill-chosen hyperparameters. These instruments leverage novel higher-order information about the gradient distribution and curvature, which has only recently become efficiently accessible. We believe that such a debugging tool, which we open-source for PyTorch, represents an important step to improve troubleshooting the training process, reveal new insights, and help develop novel methods and heuristics.

gradient, ockpit, overhead, (17 more...)

arXiv.org Machine Learning

2102.06604

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > Panama (0.04)
(2 more...)

Genre: Research Report (0.84)

Industry: Education (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

BackPACK: Packing more into backprop

Dangel, Felix, Kunstner, Frederik, Hennig, Philipp

arXiv.org Machine LearningDec-23-2019

Automatic differentiation frameworks are optimized for exactly one thing: computing the average mini-batch gradient. Y et, other quantities such as the variance of the mini-batch gradients or many approximations to the Hessian can, in theory, be computed efficiently, and at the same time as the gradient. While these quantities are of great interest to researchers and practitioners, current deep-learning software does not support their automatic calculation. Manually implementing them is burdensome, inefficient if done na ıvely, and the resulting code is rarely shared. This hampers progress in deep learning, and unnecessarily narrows research to focus on gradient descent and its variants; it also complicates replication studies and comparisons between newly developed methods that require those quantities, to the point of impossibility. Its capabilities are illustrated by benchmark reports for computing additional quantities on deep neural networks, and an example application by testing several recent curvature approximations for optimization. The success of deep learning and the applications it fuels can be traced to the popularization of automatic differentiation frameworks. However, this specialization also has its shortcomings: it assumes the user only wants to compute gradients or, more precisely, the average of gradients across a mini-batch of examples. Other quantities can also be computed with automatic differentiation at a comparable cost or minimal overhead to the gradient backpropagation pass; for example, approximate second-order information or the variance of gradients within the batch. These quantities are valuable to understand the geometry of deep neural networks, for the identification of free parameters, and to push the development of more efficient optimization algorithms. But researchers who want to investigate their use face a chicken-and-egg problem: automatic differentiation tools required to go beyond standard gradient methods are not available, but there is no incentive for their implementation in existing deep-learning software as long as no large portion of the users need it. Second-order methods for deep learning have been continuously investigated for decades (e.g., Becker & Le Cun, 1989; Amari, 1998; Bordes et al., 2009; Martens & Grosse, 2015).

approximation, gradient, individual gradient, (17 more...)

arXiv.org Machine Learning

1912.10985

Country:

Oceania > Australia > New South Wales > Sydney (0.14)
North America > Canada > Ontario > Toronto (0.14)
North America > United States > California > San Diego County > San Diego (0.04)
(6 more...)

Genre: Research Report > New Finding (0.46)

Industry: Education (0.54)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

An elegant way to represent forward propagation and back propagation in a neural network

#artificialintelligenceAug-19-2019, 00:15:15 GMT

Sometimes, you see a diagram and it gives you an'aha ha' moment I saw it on Frederick kratzert's blog Using the input variables x and y, The forwardpass (left half of the figure) calculates output z as a function of x and y i.e. f(x,y) The right side of the figures shows the backwardpass. Receiving dL/dz (the derivative of the total loss with respect to the output z), we can calculate the individual gradients of x and y on the loss function by applying the chain rule, as shown in the figure. This post is a part of my forthcoming book on Mathematical foundations of Data Science. The goal of the neural network is to minimise the loss function for the whole network of neurons. Hence, the problem of solving equations represented by the neural network also becomes a problem of minimising the loss function for the entire network.

derivative, forward propagation and back propagation, neural network, (13 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Learning Step Size Controllers for Robust Neural Network Training

Daniel, Christian (TU Darmstadt) | Taylor, Jonathan (Microsoft Research) | Nowozin, Sebastian (Microsoft Research)

AAAI ConferencesApr-19-2016

This paper investigates algorithms to automatically adapt the learning rate of neural networks (NNs). Starting with stochastic gradient descent, a large variety of learning methods has been proposed for the NN setting. However, these methods are usually sensitive to the initial learning rate which has to be chosen by the experimenter. We investigate several features and show how an adaptive controller can adjust the learning rate without prior knowledge of the learning problem at hand.

artificial intelligence, controller, machine learning, (19 more...)

AAAI Conferences

Thirtieth AAAI Conference on Artificial Intelligence

Country: Europe (0.28)

Genre:

Research Report (0.87)
Overview (0.54)

Industry: Education (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.69)

Add feedback